ABSTRACT
The association of an individual's genetic makeup with their response to drugs is referred to as pharmacogenomics. By understanding the relationship between genetic variants and drug efficacy or toxicity, we are able to optimize pharmacological therapy according to an individual's genotype. Pharmacogenomics research has historically suffered from bias and underrepresentation of people from certain ancestry groups and of the female sex. These biases can arise from factors such as drugs and indications studied, selection of study participants, and methods used to collect and analyze data. To examine the representation of biogeographical populations in pharmacogenomic data sets, we describe individuals involved in gene-drug response studies from PharmGKB, a leading repository of drug-gene annotations, and showcase CYP2D6, a gene whose encoded enzyme metabolizes approximately 25% of all prescribed drugs. We also show how the historical underrepresentation of females in clinical trials has led to significantly more adverse drug reactions in females than in males.
Subject(s)
Drug-Related Side Effects and Adverse Reactions , Sexism , Male , Humans , Female , Pharmacogenetics
ABSTRACT
A lack of diversity in genomics for health continues to hinder equitable leadership and access to precision medicine approaches for underrepresented populations. To avoid perpetuating biases within the genomics workforce and genomic data collection practices, equity, diversity, and inclusion (EDI) must be addressed. This paper documents the journey taken by the Global Alliance for Genomics and Health (a genomics-based standard-setting and policy-framing organization) to create a more equitable, diverse, and inclusive environment for its standards and members. Initial steps include the creation of two groups: the Equity, Diversity, and Inclusion Advisory Group and the Regulatory and Ethics Diversity Group. Following a framework that we call "Reflected in our Teams, Reflected in our Standards," both groups address EDI at different stages in their policy development process.
ABSTRACT
Characterization of host genetic factors contributing to COVID-19 severity promises advances in drug discovery to fight the disease. Most genetic analyses to date have identified genome-wide significant associations involving loss-of-function variants for immune response pathways. Despite accumulating evidence supporting a role for T cells in COVID-19 severity, no definitive genetic markers have been found to support an involvement of T cell responses. We analyzed 205 whole exomes from a well-characterized cohort of hospitalized severe COVID-19 patients and from controls. Significantly enriched high impact alleles were found for 25 variants within the T cell receptor beta (TRB) locus on chromosome 7. Although most of these alleles were found in heterozygosity, three or more in TRBV6-5, TRBV7-3, TRBV7-6, TRBV7-7, and TRBV10-1 suggested a possible TRB loss of function via compound heterozygosity. This loss of function in TRB genes supports suboptimal or dysfunctional T cell responses as a major contributor to severe COVID-19 pathogenesis.
ABSTRACT
Most genome-wide association studies (GWAS) for lipid traits analyze each lipid trait separately. Moreover, few GWASs have evaluated the genetic variants associated with multiple lipid traits in individuals of African ancestry. To further identify and localize loci with pleiotropic effects on lipid traits, we conducted a genome-wide meta-analysis, multi-trait analysis of GWAS (MTAG), and multi-trait fine-mapping (flashfm) in 125,000 individuals of African ancestry. Our meta-analysis and MTAG identified four and 14 novel loci associated with lipid traits, respectively. flashfm yielded an 18% mean reduction in the 99% credible set size compared to single-trait fine-mapping with JAM. Moreover, we identified more genetic variants with a posterior probability of causality >0.9 with flashfm than with JAM. In conclusion, we identified additional novel loci associated with lipid traits, and flashfm reduced the 99% credible set size to identify causal genetic variants associated with multiple lipid traits in individuals of African ancestry.
Subject(s)
Genome-Wide Association Study , Lipids , Humans , Black People , Lipids/genetics , Phenotype
ABSTRACT
Polygenic Risk Scores (PRS) (also known as polygenic scores, genetic risk scores or polygenic indexes) capture genetic contributions of a multitude of markers that characterise complex traits. Although their likely application to precision medicine remains to be established, promising advances include the stratification of high-risk individuals and targeted screening interventions. Current PRS have been mostly optimised for individuals of Northern European ancestries. If PRS are to become widespread as a tool for healthcare applications, they will need to cover more diverse populations, and greater capacity for derived interventions will be required. In this editorial we aim to attract submissions from the research community that highlight current challenges in development of PRS applications at scale. We also welcome manuscripts that delve into the ethical, social and legal implications that the implementation of PRS may generate.
Subject(s)
Genetic Predisposition to Disease , Multifactorial Inheritance , Humans , Risk Factors , Genomics , White People , Genome-Wide Association Study
ABSTRACT
Around 10% of adults infected with SARS-CoV-2 who survive a first episode of COVID-19 appear to experience long-term clinical manifestations. The signs and symptoms of this post-acute COVID-19 syndrome (PACS) include fatigue, dyspnea, joint pain, myalgia, chest pain, cough, anosmia, dysgeusia, headache, depression, anxiety, memory loss, concentration difficulties, and insomnia. These sequelae resemble the constellation of clinical manifestations previously recognized as myalgic encephalomyelitis (ME) or chronic fatigue syndrome (CFS). This condition has been described following distinct infectious events, mostly acute viral illnesses. In this way, the pathophysiology of PACS might overlap with mechanisms involved in other post-infectious fatigue syndromes. PACS is more frequent in women than in men. Additional host genetic factors could be involved. There is a dysregulation of multiple body organs and systems, involving the immune system, the coagulation cascade, endocrine organs, autonomic nervous system, microbiota-gut-brain axis, hypothalamic-pituitary-adrenal axis, hypothalamic-pituitary-thyroid axis, etc. Hypothetically, an abnormal response to certain infectious agents could trigger the development of postinfectious fatigue syndromes.
Subject(s)
COVID-19 , HIV Infections , Adult , Male , Humans , Female , Post-Acute COVID-19 Syndrome , SARS-CoV-2 , COVID-19/complications , Hypothalamo-Hypophyseal System , Pituitary-Adrenal System , Post-Infectious Disorders
ABSTRACT
BACKGROUND: Polygenic risk scores (PRS) have been widely applied in research studies, showing how population groups can be stratified into risk categories for many common conditions. As healthcare systems consider applying PRS to keep their populations healthy, little work has been carried out demonstrating their implementation at an individual level. CASE PRESENTATION: We performed a systematic curation of PRS sources from established data repositories, selecting 15 phenotypes, comprising in excess of 37 million SNPs related to cancer, cardiovascular, metabolic and autoimmune diseases. We tested selected phenotypes using whole genome sequencing data for a family of four related individuals. Individual risk scores were given percentile values based upon reference distributions among 1000 Genomes Iberians, Europeans, or all samples. Over 96 billion allele effects were calculated in order to obtain the PRS for each of the individuals analysed here. CONCLUSIONS: Our results highlight the need for further standardisation in the way PRS are developed and shared, the importance of individual risk assessment rather than the assumption of inherited averages, and the challenges currently posed when translating PRS into risk metrics.
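The scoring and percentile steps mentioned in this abstract can be illustrated with a minimal sketch. It assumes the common additive form of a PRS (a sum of effect-allele dosages weighted by per-SNP effect sizes) and an empirical percentile against a reference panel; the abstract does not state the exact procedure used, and the SNP identifiers, weights, and reference scores below are hypothetical toy values, not data from the study.

```python
from bisect import bisect_right


def polygenic_score(dosages, weights):
    """Additive PRS: sum of effect size x effect-allele dosage (0, 1 or 2).

    SNPs that have a weight but no genotype call are skipped.
    """
    return sum(weights[snp] * dosages[snp] for snp in weights if snp in dosages)


def percentile(score, reference_scores):
    """Percentage of reference scores that are less than or equal to `score`."""
    ranked = sorted(reference_scores)
    return 100.0 * bisect_right(ranked, score) / len(ranked)


# Hypothetical toy inputs (assumptions for illustration only).
weights = {"rs1": 0.5, "rs2": -0.25, "rs3": 1.0}   # per-SNP effect sizes
individual = {"rs1": 2, "rs2": 1, "rs3": 0}        # effect-allele dosages
reference = [0.0, 0.25, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0]  # scores in a reference panel

score = polygenic_score(individual, weights)
print(score, percentile(score, reference))
```

In practice the reference scores would be computed with the same SNP weights on a panel such as the 1000 Genomes samples named in the abstract, so the resulting percentile depends on which reference population is chosen.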
Subject(s)
Genome-Wide Association Study , Multifactorial Inheritance , Alleles , Genetic Predisposition to Disease , Humans , Polymorphism, Single Nucleotide , Risk Factors
ABSTRACT
The poor transferability of genetic risk scores (GRSs) derived from European ancestry data in diverse populations is a cause of concern. We set out to evaluate whether GRSs derived from data of African American individuals and multiancestry data perform better in sub-Saharan Africa (SSA) compared to European ancestry-derived scores. Using summary statistics from the Million Veteran Program (MVP), we showed that GRSs derived from data of African American individuals enhance polygenic prediction of lipid traits in SSA compared to European and multiancestry scores. However, our GRS prediction varied greatly within SSA between the South African Zulu (low-density lipoprotein cholesterol [LDL-C], R² = 8.14%) and Ugandan cohorts (LDL-C, R² = 0.026%). We postulate that differences in the genetic and environmental factors between these population groups might lead to the poor transferability of GRSs within SSA. More effort is required to optimize polygenic prediction in Africa.
Subject(s)
Genome-Wide Association Study , Population Groups , Black People/genetics , Cholesterol, LDL/genetics , Humans , Risk Factors
ABSTRACT
Although best practices have emerged on how to analyse and interpret personal genomes, the utility of whole genome screening remains underdeveloped. A large amount of information can be gathered from various types of analyses via whole genome sequencing including pathogenicity screening, genetic risk scoring, fitness, nutrition, and pharmacogenomic analysis. We recognize different levels of confidence when assessing the validity of genetic markers and apply rigorous standards for evaluation of phenotype associations. We illustrate the application of this approach on a family of five. By applying analyses of whole genomes from different methodological perspectives, we are able to build a more comprehensive picture to assist decision making in preventative healthcare and well-being management. Our interpretation and reporting outputs provide input for a clinician to develop a healthcare plan for the individual, based on genetic and other healthcare data.
ABSTRACT
Copy number variations (CNVs) are genomic structural variations (deletions, duplications, or translocations) that represent 4.8-9.5% of human genome variation in healthy individuals. In some cases, CNVs can also lead to disease, being the etiology of many known rare genetic/genomic disorders. Despite recent advances in genomic sequencing and diagnosis, the pathological effects of many rare genetic variations remain unresolved, largely due to the low number of patients available for these cases, making it difficult to identify consistent patterns of genotype-phenotype relationships. We aimed to improve the identification of statistically consistent genotype-phenotype relationships by integrating all the genetic and clinical data of thousands of patients with rare genomic disorders (obtained from the DECIPHER database) into a phenotype-patient-genotype tripartite network. Then we assessed how our network approach could help in the characterization and diagnosis of novel cases in clinical genetics. The systematic approach implemented in this work is able to better define the relationships between phenotypes and specific loci, by exploiting large-scale association networks of phenotypes and genotypes in thousands of rare disease patients. The application of the described methodology facilitated the diagnosis of novel clinical cases, ranking phenotypes by locus specificity and reporting putative new clinical features that may suggest additional clinical follow-ups. In this work, the proof of concept developed over a set of novel clinical cases demonstrates that this network-based methodology might help improve the precision of patient clinical records and the characterization of rare syndromes.
Subject(s)
DNA Copy Number Variations/genetics , Genetic Predisposition to Disease , Genome, Human/genetics , Rare Diseases/genetics , Chromosome Mapping , Comparative Genomic Hybridization , Databases, Genetic , Genetic Association Studies , Genotype , Humans , Phenotype , Polymorphism, Single Nucleotide/genetics , Rare Diseases/diagnosis , Rare Diseases/pathology , Sequence Deletion
ABSTRACT
It is generally acknowledged that, for reproducibility and progress of human genomic research, data sharing is critical. For every sharing transaction, a successful data exchange is produced between a data consumer and a data provider. Providers of human genomic data (e.g., publicly or privately funded repositories and data archives) fulfil their social contract with data donors when their shareable data conforms to FAIR (findable, accessible, interoperable, reusable) principles. Based on our experiences via Repositive (https://repositive.io), a leading discovery platform cataloguing all shared human genomic datasets, we propose guidelines for data providers wishing to maximise their shared data's FAIRness.
Subject(s)
Databases, Genetic/standards , Genome, Human/genetics , Genomics/standards , Information Dissemination , Humans
ABSTRACT
BACKGROUND: There is growing support for the stance that patients and research participants should have better and easier access to their raw (uninterpreted) genomic sequence data in both clinical and research contexts. MAIN BODY: We review legal frameworks and literature on the benefits, risks, and practical barriers of providing individuals access to their data. We also survey genomic sequencing initiatives that provide or plan to provide individual access. Many patients and research participants expect to be able to access their health and genomic data. Individuals have a legal right to access their genomic data in some countries and contexts. Moreover, increasing numbers of participatory research projects, direct-to-consumer genetic testing companies, and now major national sequencing initiatives grant individuals access to their genomic sequence data upon request. CONCLUSION: Drawing on current practice and regulatory analysis, we outline legal, ethical, and practical guidance for genomic sequencing initiatives seeking to offer interested patients and participants access to their raw genomic data.
Subject(s)
Base Sequence/genetics , Genome, Human/genetics , Genomics/legislation & jurisprudence , Ethics, Research , Genetic Testing , Genomics/ethics , Humans , Patients/legislation & jurisprudence , Research/legislation & jurisprudence
ABSTRACT
BACKGROUND: Root and tuber crops are a major food source in tropical Africa. Among these crops are several species in the monocotyledonous genus Dioscorea collectively known as yam, a staple tuber crop that contributes enormously to the subsistence and socio-cultural lives of millions of people, principally in West and Central Africa. Yam cultivation is constrained by several factors, and yam can be considered a neglected "orphan" crop that would benefit from crop improvement efforts. However, the lack of genetic and genomic tools has impeded the improvement of this staple crop. RESULTS: To accelerate marker-assisted breeding of yam, we performed genome analysis of white Guinea yam (Dioscorea rotundata) and assembled a 594-Mb genome, 76.4% of which was distributed among 21 linkage groups. In total, we predicted 26,198 genes. Phylogenetic analyses with 2381 conserved genes revealed that Dioscorea is a unique lineage of monocotyledons distinct from the Poales (rice), Arecales (palm), and Zingiberales (banana). The entire Dioscorea genus is characterized by the occurrence of separate male and female plants (dioecy), a feature that has limited efficient yam breeding. To infer the genetics of sex determination, we performed whole-genome resequencing of bulked segregants (quantitative trait locus sequencing [QTL-seq]) in F1 progeny segregating for male and female plants and identified a genomic region associated with female heterogametic (male = ZZ, female = ZW) sex determination. We further delineated the W locus and used it to develop a molecular marker for sex identification of Guinea yam plants at the seedling stage. CONCLUSIONS: Guinea yam belongs to a unique and highly differentiated clade of monocotyledons. The genome analyses and sex-linked marker development performed in this study should greatly accelerate marker-assisted breeding of Guinea yam. 
In addition, our QTL-seq approach can be utilized in genetic studies of other outcrossing crops and organisms with highly heterozygous genomes. Genomic analysis of orphan crops such as yam promotes efforts to improve food security and the sustainability of tropical agriculture.
Subject(s)
Dioscorea/genetics , Genome, Plant , Biomarkers/metabolism , Crops, Agricultural/genetics , Plant Breeding , Quantitative Trait Loci , Whole Genome Sequencing
ABSTRACT
Scientific research relies on computer software, yet software is not always developed following practices that ensure its quality and sustainability. This manuscript does not aim to propose new software development best practices, but rather to provide simple recommendations that encourage the adoption of existing best practices. Software development best practices promote better quality software, and better quality software improves the reproducibility and reusability of research. These recommendations are designed around Open Source values, and provide practical suggestions that contribute to making research software and its source code more discoverable, reusable and transparent. This manuscript is aimed at developers, but also at organisations, projects, journals and funders that can increase the quality and sustainability of research software by encouraging the adoption of these recommendations.
ABSTRACT
SUMMARY: The vast, uncoordinated proliferation of bioinformatics resources (databases, software tools, training materials etc.) makes it difficult for users to find them. To facilitate their discovery, various services are being developed to collect such resources into registries. We have developed BioCIDER, which, rather like online shopping 'recommendations', provides a contextualization index to help identify biological resources relevant to the content of the sites in which it is embedded. AVAILABILITY AND IMPLEMENTATION: BioCIDER (www.biocider.org) is an open-source platform. Documentation is available online (https://goo.gl/Klc51G), and source code is freely available via GitHub (https://github.com/BioCIDER). The BioJS widget that enables websites to embed contextualization is available from the BioJS registry (http://biojs.io/). All code is released under an MIT licence. CONTACT: carlos.horro@earlham.ac.uk or rafael.jimenez@elixir-europe.org or manuel@repositive.io.
Subject(s)
Computational Biology/methods , Databases, Factual , Software
ABSTRACT
Metrics for assessing adoption of good development practices are a useful way to ensure that software is sustainable, reusable and functional. Sustainability means that the software used today will be available - and continue to be improved and supported - in the future. We report here an initial set of metrics that measure good practices in software development. This initiative differs from previously developed efforts in being a community-driven grassroots approach where experts from different organisations propose good software practices that have reasonable potential to be adopted by the communities they represent. We not only focus our efforts on understanding and prioritising good practices, we assess their feasibility for implementation and publish them here.
ABSTRACT
BACKGROUND: Network medicine is a promising new discipline that combines systems biology approaches and network science to understand the complexity of pathological phenotypes. Given the growing availability of personalized genomic and phenotypic profiles, network models offer a robust integrative framework for the analysis of "omics" data, allowing the characterization of the molecular aetiology of pathological processes underpinning genetic diseases. METHODS: Here we make use of patient genomic data to exploit different network-based analyses to study genetic and phenotypic relationships between individuals. For this method, we analyzed a dataset of structural variants and phenotypes for 6,564 patients from the DECIPHER database, which encompasses one of the most comprehensive collections of pathogenic Copy Number Variations (CNVs) and their associated ontology-controlled phenotypes. We developed a computational strategy that identifies clusters of patients in a synthetic patient network according to their genetic overlap and phenotype enrichments. RESULTS: Many of these clusters of patients represent new genotype-phenotype associations, suggesting the identification of newly discovered phenotypically enriched loci (indicative of potential novel syndromes) that are currently absent from reference genomic disorder databases such as ClinVar, OMIM or DECIPHER itself. CONCLUSIONS: We provide a high-resolution map of pathogenic phenotypes associated with their respective significant genomic regions and a new powerful tool for diagnosis of currently uncharacterized mutations leading to deleterious phenotypes and syndromes.
Subject(s)
DNA Copy Number Variations , Genetic Diseases, Inborn/genetics , Genomics/methods , Phenotype , Case-Control Studies , Databases, Genetic , Genetic Association Studies , Genetic Loci , Humans , Mutation
ABSTRACT
BACKGROUND: We describe the pioneering experience of a Spanish family pursuing the goal of understanding their own personal genetic data to the fullest possible extent using direct-to-consumer (DTC) tests. With full informed consent from the Corpas family, all genotype, exome and metagenome data from members of this family are publicly available under a public domain Creative Commons 0 (CC0) license waiver. All scientists or companies analysing these data ("the Corpasome") were invited to return results to the family. METHODS: We released 5 genotypes, 4 exomes, and 1 metagenome from the Corpas family via a blog and figshare under a public domain license, inviting scientists to join the crowdsourcing efforts to analyse the genomes in return for coauthorship or acknowledgement in derived papers. Resulting analysis data were compiled via social media and direct email. RESULTS: Here we present the results of our investigations, combining the crowdsourced contributions and our own efforts. Annotation services for genomic variants from four companies were applied to four family exomes: BIOBASE, Ingenuity, Diploid, and GeneTalk. Starting from a common VCF file and after selecting for significant results from company reports, we find no overlap among described annotations. We additionally report on a gut microbiome analysis of a member of the Corpas family. CONCLUSIONS: This study presents an analysis of a diverse set of tools and methods offered by four DTC companies. The striking discordance of the results mirrors previous findings with respect to DTC analysis of SNP chip data, and highlights the difficulties of using DTC data for preventive medical care. To our knowledge, the data and analysis results from our crowdsourced study represent the most comprehensive exome and analysis for a family quartet using solely DTC data generation to date.
Subject(s)
Crowdsourcing , Family , Genetic Testing , Genomics , Crowdsourcing/methods , Exome , Female , Gene Frequency , Genetic Testing/methods , Genomics/methods , Genotype , Humans , Male , Metagenome , Pedigree , Phenotype , Polymorphism, Single Nucleotide , Precision Medicine/methods , Quantitative Trait, Heritable , Spain
ABSTRACT
BioJS is an open source software project that develops visualization tools for different types of biological data. Here we report on the factors that influenced the growth of the BioJS user and developer community, and outline our strategy for building on this growth. The lessons we have learned on BioJS may also be relevant to other open source software projects.